Runbook for new Ephemeral Hotplug Volume Metric #328

Merged
sradco merged 1 commit into kubevirt:main from Dsanatar:add-ephemeral-hotplug-runbook
Dec 16, 2025

Conversation

Contributor

@Dsanatar Dsanatar commented Nov 25, 2025

What this PR does / why we need it:

Adds a new runbook for an alert that is to be merged in KubeVirt. The alert fires when a VMI contains an ephemeral hotplug volume. The runbook notifies users of the feature's future deprecation and describes the steps to convert these ephemeral volumes into persistent volumes.

Kubevirt PR that adds new metric/alert: kubevirt/kubevirt#15815

Jira: https://issues.redhat.com/browse/CNV-69387

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note:


@kubevirt-bot kubevirt-bot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Nov 25, 2025
@Dsanatar Dsanatar force-pushed the add-ephemeral-hotplug-runbook branch from 6604025 to 734882a Compare November 25, 2025 19:50

To diagnose the cause of this alert, the following steps can be taken:

1. Find the affected VM(s) and volume(s) that are reported in the Alert.
Member

Find the affected VM(s)

@Dsanatar may want to show how to list all VMIs with the matching label selector that was added in kubevirt/kubevirt#15815

Contributor Author

@Dsanatar Dsanatar Nov 25, 2025

I pushed some changes today that use an annotation instead of the previous label approach, so that we can store the list of volume names without having to worry about the character limit for label values. I figured this would make things easier when it came time to patch the specs. That being said, since we can't use a label selector, do you think a jq command to filter VMs that have the specific annotation would suffice?
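
For illustration, a jq filter of the kind mentioned above might look like the following. This is only a sketch under assumptions: the annotation key `kubevirt.io/ephemeral-hotplug-volumes` is taken from the runbook excerpt quoted later in this thread and is assumed to be set on VMI objects, the function name `filter_annotated_vmis` is hypothetical, and the annotation value is printed as-is because its exact format is not stated here.

```bash
#!/usr/bin/env bash
# Assumed annotation key (from the runbook excerpt in this thread).
ANNOTATION="kubevirt.io/ephemeral-hotplug-volumes"

# Hypothetical helper: reads a `kubectl get vmi -o json` style List on stdin
# and prints "<namespace>/<name><TAB><annotation value>" for every VMI that
# carries the annotation. Filtering happens entirely on the client side.
filter_annotated_vmis() {
  jq -r --arg key "$ANNOTATION" '
    .items[]
    | select(.metadata.annotations[$key]? != null)
    | "\(.metadata.namespace)/\(.metadata.name)\t\(.metadata.annotations[$key])"
  '
}

# Usage against a live cluster (requires kubectl access and jq):
#   kubectl get vmi --all-namespaces -o json | filter_annotated_vmis
```

Because the server cannot select on annotations, every VMI is fetched and then filtered locally, which is the tradeoff raised in the reply below.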

Contributor Author

Thanks for pointing that out, though. I added the new annotation so that it could be queried and used here. Will fix.

Member

I assumed that the label was just a flag, so the value would be true or even empty. The downside of annotations is that you have to get all VMs and filter on the client side rather than having the server filter. Not sure the convenience of having the volume names in an annotation is worth that cost. Let's see what other reviewers think.

Contributor Author

@Dsanatar Dsanatar Nov 26, 2025

Yeah, true, there's definitely a tradeoff here. Before I changed it to an annotation, I was toying around with a script (or a list of commands) we could provide in the runbook to more easily grab all the affected volumes for each VM, but the logic was becoming very similar to what we are already doing in the virt-controller, so I figured I could just leverage the existing logic instead. If we were to revert to the label selector method, do you think providing the list of impacted VMs and instructing users to compare the two specs to find the volumes would be enough?
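
As a rough idea of what comparing the two specs could involve, the sketch below lists volume names that appear in a VMI spec but not in the owning VM's template. It assumes that such VMI-only volumes are the ephemeral hotplug volumes of interest, which may not match the virt-controller logic exactly. The function name `vmi_only_volumes` and the file names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints volume names that exist in the VMI spec but are
# missing from the VM template (assumed here to be the ephemeral hotplug ones).
#   $1: file containing `kubectl get vm <name> -o json` output
#   $2: file containing `kubectl get vmi <name> -o json` output
vmi_only_volumes() {
  jq -r -n --slurpfile vm "$1" --slurpfile vmi "$2" '
    [$vm[0].spec.template.spec.volumes[]?.name] as $persisted
    | [$vmi[0].spec.volumes[]?.name] - $persisted
    | .[]
  '
}

# Usage against a live cluster (requires kubectl access and jq):
#   kubectl get vm  <vm-name> -n <namespace> -o json > vm.json
#   kubectl get vmi <vm-name> -n <namespace> -o json > vmi.json
#   vmi_only_volumes vm.json vmi.json
```

This would have to be repeated per VM, which is part of why keeping the volume names in an annotation is convenient.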

Collaborator

@sradco sradco Dec 14, 2025

What is the status of this? @mhenriks, do you approve the runbook?

@Dsanatar Dsanatar force-pushed the add-ephemeral-hotplug-runbook branch 2 times, most recently from 2cc6c2a to 4793acd Compare December 8, 2025 17:49
Collaborator

sradco commented Dec 14, 2025

@Dsanatar please add the Jira issue to the description

Contributor

@akalenyu akalenyu left a comment

the runbook looks good to me


Once the VM is patched, you can remove the Alert by running:
``` bash
$ kubectl annotate vmi <vmi-name> kubevirt.io/ephemeral-hotplug-volumes- --overwrite
```
Member

Shouldn't this get removed automatically by VM controller?

Contributor Author

Yeah, I admit this feels a bit hacky. The reason I have this here is that the VM object used to compare volumes against the VMI doesn't seem to get updated to show the new hotplug volumes once they are patched to be persistent (the object only shows the new volumes after a VM restart, for some reason). Perhaps there is a bug in how I'm retrieving the VM from the cache?

https://github.com/Dsanatar/kubevirt/blob/44c2ed1cd22787cd21e848ff3d4d78d2a457cb0d/pkg/virt-controller/watch/vmi/lifecycle.go#L231-L245

Member

Yeah, this seems like a bug. I think that function looks okay. Have you traced it, and is it returning nil?

Contributor Author

Removed this mitigation step. Will update the KubeVirt PR to remove the annotation at the controller level.

@Dsanatar Dsanatar force-pushed the add-ephemeral-hotplug-runbook branch from 4793acd to a8e844d Compare December 15, 2025 17:27
@mhenriks
Member

/lgtm

Signed-off-by: dsanatar <dsanatar@redhat.com>
@Dsanatar Dsanatar force-pushed the add-ephemeral-hotplug-runbook branch from a8e844d to 8846e25 Compare December 16, 2025 16:34
@sradco sradco merged commit a4a4a07 into kubevirt:main Dec 16, 2025
2 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 16, 2025
…hotplug-runbook

Runbook for new Ephemeral Hotplug Volume Metric
Labels

dco-signoff: yes Indicates the PR's author has DCO signed all their commits. size/M

5 participants